1. Introduction

Consider a game $\{\Omega, \mathcal{F}, \mathcal{R}\}$ where $\Omega$ is a sample space, $\mathcal{R}$ is a set of prizes and $\mathcal{F}$ is a set of actions or decision functions $F(\cdot)$ mapping $\Omega$ into $\mathcal{R}$. In a statistical decision problem we may have, following the notation of Ferguson [4], $\Omega = \Theta \times \mathcal{X}$ or $\Omega = \mathcal{X}$, where $\Theta$ is a parameter space and $\mathcal{X}$ a sample space for a random variable whose distribution may depend on $\theta \in \Theta$. When confronted with such a decision problem, a rational decision maker will seek to specify a preference ordering on the prizes. If the state of nature is known with certainty, the decision maker will attempt to choose an action in $\mathcal{F}$ which yields the most preferred prize. The problem is that, in most cases, the decision maker does not operate in a risk-free environment. Instead, decisions must usually be made in the face of uncertainty about the state of nature. It is the objective of the decision maker to use whatever knowledge he has about the states of nature and the resulting consequences of his possible actions to select the most desirable alternative available to him.

What analytical tools are available to a decision maker to help him make rational decisions in the face of uncertainty? First, let us look at the case where the decision maker knows the probability distribution over the states of nature as he would, for example, if $\omega \in \Omega$ were selected as the result of a gamble such as drawing a card, rolling dice or spinning a roulette wheel. Let us denote this probability by $P$. If the set of prizes $\mathcal{R}$ is reasonably rich and the decision maker's preference ordering satisfies certain reasonable restrictions, a fundamental result of utility theory (see von Neumann and Morgenstern [8]) guarantees that a rational decision maker should behave as if he had assigned a numerical measure (utility) $u$ over $\mathcal{R}^*$, the class of distributions over $\mathcal{R}$, and that he would prefer an action $F \in \mathcal{F}$ yielding the prize with largest expected utility (if one exists):

$$\int_\Omega u(F(\omega))\,dP(\omega).$$

Thus, in order to evaluate a rule $F$, a rational person should ascertain the values, to himself, of the various prizes; he should weigh those values with the probabilities that the prizes will be received using $F$.

Now, let us consider the case in which the probability distribution of the states of nature is not known by the decision maker. Subjectivists would have the decision maker utilize the available information about the states of nature and the context of the problem to "personally" assess the probabilities. He then simply uses his subjective distribution, in lieu of the unknown probability distribution $P$, in the manner described above.

A second approach to this problem would be to have the decision maker establish a preference ordering over $\mathcal{D}^*$, the set of randomized decision rules (probability distributions over $\mathcal{D} = \{u \circ F : F \in \mathcal{F}\}$), satisfying certain reasonable restrictions [4]. The optimal decision rule would then be determined by selecting a member (if any) of $\mathcal{D}^*$ with the highest preference rank.

In this paper we show that these two approaches for the case of decision making under uncertainty are equivalent. More precisely, we show that if the decision maker's preference pattern over $\mathcal{D}^*$ is appropriately related to his preference pattern over $\mathcal{R}^*$, then his preferences on $\mathcal{D}^*$ agree with a utility function $U$ on $\mathcal{D}^*$, and there exists a probability measure $P$ such that the $U$ utility of a degenerate element of $\mathcal{D}^*$ (i.e., an element of $\mathcal{D}$) is the expected value of the utilities $u$ on $\mathcal{R}^*$ with respect to the probability measure $P$. Mathematically, if $D$ is an element of $\mathcal{D}^*$ degenerate at $u \circ F$,

$$U(D) = \int_\Omega u(F(\omega))\,dP(\omega).$$

This means simply that the rationality criteria of utility theory are such that the decision maker is forced to act as if he knows the distribution over $\Omega$ and a utility scaling of the consequences, and an optimum decision is one maximizing the expected utility with respect to that distribution. Although the decision maker may not explicitly state the "subjective probability measure" $P$, such a distribution is implicit from his utility assignments. In the statistical decision problem with $\Omega = \Theta \times \mathcal{X}$, the marginal distribution over $\Theta$ is the "prior" distribution of the states of nature.

This result is not surprising, for the axioms of utility theory used as guides for consistency in judgment in ranking preferences impute to the decision maker the ability of making arbitrarily fine discriminations in judgment. Intuitively, it is reasonable that his subjective probability distribution over $\Omega$ is induced by his utility scaling of the alternatives and the consequences. Suppose $\Omega = \{\omega_1, \ldots, \omega_n\}$ and $r_L$ and $r_M$ are elements of $\mathcal{R}^*$ such that $0 = u(r_L) \le u(r) \le u(r_M) = 1$ for all $r \in \mathcal{R}^*$. Let $D_j$ be such that $D_j(\omega_j) = 1$ and $D_j(\omega_i) = 0$ for all $i \ne j$. Then, it seems plausible that the utility value $U(D_j)$ is (up to normalization) the decision maker's subjective probability that the state of nature is $\omega_j$. For example, take $\Omega = \{\omega_1, \omega_2\}$ and suppose that $D$ is such that $D(\omega_1) = 1$ and $D(\omega_2) = 0$ ("heads" $= \omega_1$ pays \$1 and "tails" $= \omega_2$ pays \$0). If the decision maker's utility function assigns $D$ the value 0.3 (using that utility function normalized over the interval $[0,1]$) it would not be surprising to discover that $P(\{\omega_1\}) = 0.3$.

That a decision maker's utility function over $\mathcal{D}$ is an expectation of the utilities of the prizes with respect to a probability distribution, which we call his subjective distribution, is not a new result. In fact, results of this nature can be found in many references: see, for example, Ferguson [4], DeGroot [3], Fine [5], Fishburn [6] and Anscombe and Aumann [1]. The approach referenced in the literature is, nevertheless, unnecessarily restrictive. For example, the published results apply only to the case where $\Omega$ is finite and where the assumptions connecting the two preference patterns are much stronger than required.

In this paper we provide a rather simple development which relaxes the assumptions found in the literature. In fact, after the appropriate machinery is established, we show that the result is a straightforward application of a powerful theorem of mathematical analysis.

We devote the following section to the development of a mathematical structure leading to a general statement of existence of a subjective probability measure. In Section 3 we discuss how the probability measure may actually be constructed. In Sections 4 and 5, we illustrate such constructions with examples, which suggest applications.






2. Development

We take as given a set of axioms of utility theory such as those found in Ferguson [4, pp. 11-20]. Let $\mathcal{R}$ be a set of prizes and $\mathcal{R}^*$ a class of distributions over $\mathcal{R}$, so that $\mathcal{R}^*$ is closed under convex linear combinations (that is, $r_1$ and $r_2 \in \mathcal{R}^*$ imply that $\alpha r_1 + (1-\alpha) r_2 \in \mathcal{R}^*$ for $0 \le \alpha \le 1$). We assume that all degenerate probability distributions belong to $\mathcal{R}^*$, so $\mathcal{R}$ is embedded in $\mathcal{R}^*$. Assume the decision maker has a preference ordering on $\mathcal{R}^*$ satisfying conditions which guarantee the existence of a utility function on $\mathcal{R}^*$. Let $u$ be the unique utility function which maps $\mathcal{R}^*$ onto the interval $[0,1]$. (This is possible provided there is a least desirable prize $r_L$ and a most desirable prize $r_M$ in $\mathcal{R}^*$.)

Let $\Omega$ be a locally compact Hausdorff space, and let $\mathcal{F}$ be the class of continuous functions mapping $\Omega$ into $\mathcal{R}^*$ with compact support. Consider the class of functions

$$\mathcal{D} = \{u \circ F : F \in \mathcal{F}\}.$$

We denote the elements of $\mathcal{D}$ by $D_F$. We illustrate the relationships of these mappings (the mapping $U: \mathcal{D} \to [0,1]$ is discussed later) in Figure 1. With the $u$ utility induced by a preference ordering, prizes in $\mathcal{R}^*$ are determined up to indifference by their utilities. Therefore, we partition $\mathcal{R}^*$ into equivalence classes of prizes according to their utilities. We define the equivalence class $R_\alpha = \{r : u(r) = \alpha\}$ and the family $C = \{R_\alpha : \alpha \in [0,1]\}$. This establishes a one-to-one correspondence between the interval $[0,1]$ and $C$. Without loss of generality, in the remainder we identify $\mathcal{R}^*$ with $C$.
FIGURE 1. Relationships among the mappings $F$, $u$, $D_F = u \circ F$ and $U$.

Lemma 1: $\mathcal{D}$ is the class of all continuous functions from $\Omega$ into $[0,1]$ which have compact support.

Proof: Define a function $\rho$ on $\mathcal{R}^* \times \mathcal{R}^*$ as follows:

$$\rho(r_1, r_2) = |u(r_1) - u(r_2)| \qquad (r_1, r_2 \in \mathcal{R}^*).$$

It is easily seen that $\rho$ is a metric on $\mathcal{R}^*$ and that, with the topology induced by this metric, $u$ is a continuous function from $\mathcal{R}^*$ to $[0,1]$. Being compositions of continuous functions, the elements of $\mathcal{D}$ are themselves continuous. Further, it is clear that the support of $D_F$ and the support of $F$ are identical. For example, let $A = \{\omega : D_F(\omega) > 0\}$ and $B = \{\omega : F(\omega) \ne r_L\}$. Then, $u(F(\omega)) = D_F(\omega) > 0$ implies $F(\omega) \ne r_L$ which, in turn, implies that $\omega \in B$. Thus $A \subset B$. Conversely, if $\omega \in B$, then $u(F(\omega)) > 0$ and $D_F(\omega) > 0$ so $\omega \in A$. Thus, $B = A$. Since the support of $F$ is compact, $D_F$ is a continuous function from $\Omega$ to $[0,1]$ with compact support.

On the other hand, let $D$ be any continuous function from $\Omega$ into $[0,1]$ with compact support. Since $u^{-1}$ is continuous, $u^{-1} \circ D \in \mathcal{F}$, and $D$ is a member of $\mathcal{D}$. []

We now show that $\mathcal{D}$ is sufficiently rich to support a utility function, that is, $\mathcal{D}^* = \mathcal{D}$.

Lemma 2: $\mathcal{D}$ is closed under convex combinations.

Proof: Let $D_1$ and $D_2$ be members of $\mathcal{D}$ and $0 < \lambda < 1$. Then $\lambda D_1$ is continuous and the support of $\lambda D_1$ is the same as the support of $D_1$, so $\lambda D_1 \in \mathcal{D}$. Similarly $(1-\lambda) D_2 \in \mathcal{D}$. Then $D = \lambda D_1 + (1-\lambda) D_2$, being the sum of two continuous functions, is continuous and takes values in $[0,1]$, and the support of $D$ is the union of the support of $D_1$ and the support of $D_2$. Thus, the support of $D$ is compact and $D \in \mathcal{D}$. The case for $\lambda = 0$ or $\lambda = 1$ is trivial. []

We now assume that the decision maker has a preference ordering over $\mathcal{D}$ which determines a corresponding utility function $U$. Beyond the requirements imposed on $U$ by the utility axioms, we require only that $U$ be bounded on $\mathcal{D}$, say $|U(D)| \le M$ for all $D \in \mathcal{D}$, and that $U(D_0) = 0$, where $D_0(\omega) \equiv 0$ for all $\omega \in \Omega$. Later, we discuss how to normalize $U$ appropriately.

Recall our remark that specification of $U$ appears to determine a probability distribution such that the decision maker behaves as if he were taking expectations of the utilities of the prizes with respect to that distribution. Mathematical analysis treats such representations of linear functionals as integrals (expectations) with respect to certain measures (probabilities). In a consistent utility application, the utility of a given decision is

$$U(D_F) = \int_\Omega D_F(\omega)\,dP(\omega)$$

when the probability measure $P$ over $\Omega$ is known.

We wish to show the converse; that is, if $U$ is a utility function defined over the class $\mathcal{D}$, then there exists a probability measure $P$ such that $U(D_F)$ is the expectation of $D_F$ with respect to $P$. The Riesz representation theorem guarantees this converse is indeed true under certain conditions.

Theorem 1: (Riesz Representation Theorem)

Let $\Omega$ be a locally compact Hausdorff space, and let $U_e$ be a positive linear functional on the class $\mathcal{D}_e$ of real-valued continuous functions on $\Omega$ with compact support. Then there exists a $\sigma$-algebra $M$ over $\Omega$ which contains the Borel sets in $\Omega$, and there exists a unique positive measure $\mu$ on $M$ which represents $U_e$ in the sense that

(a) $U_e(D) = \int_\Omega D(\omega)\,d\mu(\omega)$ $(\forall D \in \mathcal{D}_e)$.

(b) $\mu(K) < \infty$ for every compact set $K \subset \Omega$.

(c) $\mu(E) = \sup\{\mu(K) : K \subset E,\ K \text{ compact}\}$ for every open set $E$.

Proof: See Rudin [7, pp. 40-46].



In order to apply this theorem in our problem we must verify that, with appropriate extensions, the conditions of the theorem are met. The only difficulty with applying the theorem directly is that $U$ is not a linear functional over $\mathcal{D}$. We are therefore required to extend $\mathcal{D}$ to a vector space $\mathcal{D}_e$ and $U$ to a linear functional $U_e$ over $\mathcal{D}_e$.

Toward that end, let $\mathcal{D}_e$ be the linear manifold generated by $\mathcal{D}$ and define $U_e$ over $\mathcal{D}_e$ as follows: For $D_1, D_2, \ldots, D_n$ in $\mathcal{D}$ and scalars $\alpha_1, \alpha_2, \ldots, \alpha_n$,

$$U_e\Big(\sum_{i=1}^{n} \alpha_i D_i\Big) = \sum_{i=1}^{n} \alpha_i U(D_i). \qquad (4)$$

With these extensions we now assert:



Lemma 3: The linear manifold $\mathcal{D}_e$ is the vector space of all continuous functions from $\Omega$ into the reals with compact support, $C_c(\Omega)$, and the mapping $U_e$ defined by (4) is a positive linear functional over $\mathcal{D}_e$.

Proof: (a) That $\mathcal{D}_e$ is a vector space follows from the fact that it is a linear manifold. Suppose $D \in \mathcal{D}_e$. Then $D = \sum_{i=1}^{n} \alpha_i D_i$ for some scalars $\{\alpha_i\}$ and functions $\{D_i\}$ in $\mathcal{D}$. Since scalar multiples of continuous functions and sums of continuous functions are continuous, it is clear that $D$ is continuous. Let $S_i$ be the support of $D_i$ and $S$ the support of $D$. Then, clearly, $S \subset \bigcup_{i=1}^{n} S_i$. Since $S$ is closed and a subset of the compact set $\bigcup_{i=1}^{n} S_i$, $S$ is itself compact. Thus, $D$ has compact support and $D \in C_c(\Omega)$, so that $\mathcal{D}_e \subset C_c(\Omega)$.

Now let $G \in C_c(\Omega)$ and let $G^+$ and $G^-$ be nonnegative functions such that $G = G^+ - G^-$. Both $G^+$ and $G^-$ are in $C_c(\Omega)$. Let $M^+$ and $M^-$ be such that $M^+ = \max_\Omega G^+(\omega)$ and $M^- = \max_\Omega G^-(\omega)$, and define $D^+$ and $D^-$ as follows (if $M^+$ or $M^-$ is zero, the result follows with slight modification):

$$D^+(\omega) = \frac{G^+(\omega)}{M^+} \quad \text{and} \quad D^-(\omega) = \frac{G^-(\omega)}{M^-}.$$

Then $D^+$ and $D^-$ are in $\mathcal{D}$ and $G = M^+ D^+ - M^- D^-$. This implies that $C_c(\Omega) \subset \mathcal{D}_e$ and, therefore, $C_c(\Omega) = \mathcal{D}_e$.

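(b) In order to show $U_e$ is a linear functional on $\mathcal{D}_e$, let $a$, $b$ be scalars and $D_1, D_2 \in \mathcal{D}_e$. Then

$$U_e(aD_1 + bD_2) = U_e\Big(a \sum_{i=1}^{n} \alpha_{1i} D_{1i} + b \sum_{i=1}^{n} \alpha_{2i} D_{2i}\Big),$$

where $D_1 = \sum_i \alpha_{1i} D_{1i}$; $D_2 = \sum_i \alpha_{2i} D_{2i}$; $D_{1i}$ and $D_{2i} \in \mathcal{D}$; and $a$, $b$, $\alpha_{1i}$, $\alpha_{2i}$ are scalars for $i = 1, 2, \ldots, n$. But

$$U_e(aD_1 + bD_2) = \sum_{i=1}^{n} \big(a\alpha_{1i} U(D_{1i}) + b\alpha_{2i} U(D_{2i})\big) = aU_e(D_1) + bU_e(D_2).$$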



(c) We now show that $U_e$ is positive. Let $D = \sum_i \alpha_i D_i$ be nonnegative, i.e., for all $\omega \in \Omega$, $D(\omega) \ge 0$. Since $D_1, \ldots, D_n$ are in $\mathcal{D}$, it cannot be the case that $\alpha_i < 0$ for all $i$. Thus, one of the following cases must hold:

(i) $\alpha_i = 0$ for all $i$;

(ii) $\alpha_i \ge 0$ for all $i$;

(iii) $\alpha_i > 0$ for some $i$ and $\alpha_i < 0$ for some $i$.

Case (i): $D \equiv 0$ and is in $\mathcal{D}$, so $U_e(D) = U(D) = 0$.

Case (ii): $U_e(D) = \sum_i \alpha_i U(D_i) \ge 0$, since $U(D_i) \ge 0$ for all $i$.

Case (iii): Let $b = \sum_{\{i:\, \alpha_i > 0\}} \alpha_i$. Define $D' = (1/b) D$. Since $D'$ is nonnegative and $D'(\omega) = (1/b) \sum_i \alpha_i D_i(\omega) \le 1$, we note that $D' \in \mathcal{D}$. Thus $U_e(D) = bU(D') \ge 0$. []
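The decomposition used in part (a) of the proof of Lemma 3, together with the linear extension (4), can be checked numerically on a finite sample space. The following is only a sketch: the three-point sample space, the measure `P` standing in for the functional $U$, and the function `G` are all illustrative assumptions, not taken from the paper.

```python
from fractions import Fraction

def split_and_scale(G):
    """Write G = M_plus * D_plus - M_minus * D_minus with D_plus, D_minus in [0, 1]."""
    G_plus = [max(g, 0) for g in G]      # positive part of G
    G_minus = [max(-g, 0) for g in G]    # negative part of G
    M_plus, M_minus = max(G_plus), max(G_minus)
    # If a maximum is zero the corresponding part vanishes; use the zero function.
    D_plus = [g / M_plus if M_plus else Fraction(0) for g in G_plus]
    D_minus = [g / M_minus if M_minus else Fraction(0) for g in G_minus]
    return M_plus, D_plus, M_minus, D_minus

def U(D, P):
    """Utility of D in the class of [0, 1]-valued functions, as an expectation under P."""
    return sum(d * p for d, p in zip(D, P))

# Hypothetical three-point sample space with an assumed probability measure.
P = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]
# A real-valued function that is NOT in the original class (values outside [0, 1]).
G = [Fraction(3), Fraction(-2), Fraction(1, 2)]

M_plus, D_plus, M_minus, D_minus = split_and_scale(G)
# Extension by linearity, as in equation (4).
U_e_G = M_plus * U(D_plus, P) - M_minus * U(D_minus, P)
direct = sum(g * p for g, p in zip(G, P))
print(U_e_G, direct)  # both equal 11/12
```

The extended value agrees with the direct expectation of `G`, which is what linearity of $U_e$ requires.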



We now cast the Riesz Representation Theorem for a utility problem. Since $U$ agrees with $U_e$ on $\mathcal{D}$, in particular, the Riesz Representation Theorem gives an integral representation of $U$.

Corollary 1: There exists a $\sigma$-algebra $M$ over $\Omega$ which contains the Borel sets in $\Omega$, and there exists a unique probability measure $P$ on $M$ such that for all $D \in \mathcal{D}$

$$U(D) = \int_\Omega D(\omega)\,dP(\omega).$$

Proof: We must show that, with appropriate normalization of the utility function $U$, the induced measure $\mu$ in Theorem 1 is a probability measure. Let $K$ be an arbitrary compact subset of $\Omega$. Then by Urysohn's lemma (see, for example, Rudin [7, p. 39]), there exists a continuous real-valued function $D^K$ on $\Omega$, with values in $[0,1]$, which is identically 1 on $K$ and for which the support of $D^K$, say $S$, is compact. Thus $D^K \in \mathcal{D}$ and

$$U_e(D^K) = U(D^K) = \int_\Omega D^K(\omega)\,d\mu(\omega) = \int_K d\mu(\omega) + \int_{S-K} D^K(\omega)\,d\mu(\omega).$$
Now, since $U$ is bounded by $M$ we have that

$$\int_K d\mu(\omega) + \int_{S-K} D^K(\omega)\,d\mu(\omega) \le M,$$

and, consequently

$$\mu(K) = \int_K d\mu(\omega) \le M.$$

By Theorem 1, $\mu(\Omega) = \sup\{\mu(K) : K \subset \Omega,\ K \text{ compact}\}$. Since $\mu(K) \le M$ for all compact sets $K$ we have that $\mu(\Omega) \le M$.

Let us select that utility function $U'$ on $\mathcal{D}$ which is equivalent to $U$ (up to a linear transformation) such that $U'(D) = \frac{1}{\mu(\Omega)} U(D)$. Theorem 1 now implies

$$U'(D) = \frac{1}{\mu(\Omega)} U(D) = \frac{1}{\mu(\Omega)} \int_\Omega D(\omega)\,d\mu(\omega).$$

On taking $P(E) = \mu(E)/\mu(\Omega)$ for all $E \in M$, we get

$$U'(D) = \int_\Omega D(\omega)\,dP(\omega) \qquad (\forall D \in \mathcal{D}),$$

where $P$ is a probability measure on $M$. []
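The normalization step in the proof of Corollary 1 can be illustrated on a finite sample space, where the measure $\mu$ is just a table of point masses. This is a minimal sketch under assumed data: the point labels and the unnormalized utilities in `U_indicator` are hypothetical.

```python
from fractions import Fraction

# Hypothetical (unnormalized) utilities assigned to the indicator of each point.
U_indicator = {"w1": Fraction(3, 2), "w2": Fraction(1, 2), "w3": Fraction(1)}

# On a finite space the representing measure mu is given by these point masses.
mu = dict(U_indicator)
mu_total = sum(mu.values())                        # mu(Omega)

# Normalize: P(E) = mu(E) / mu(Omega).
P = {w: m / mu_total for w, m in mu.items()}

def U(D):
    """Unnormalized utility: integral of D with respect to mu."""
    return sum(D[w] * mu[w] for w in mu)

def U_prime(D):
    """Normalized utility: expectation of D with respect to P."""
    return sum(D[w] * P[w] for w in P)

D = {"w1": Fraction(1, 2), "w2": Fraction(1), "w3": Fraction(0)}
print(U(D) / mu_total, U_prime(D))  # both equal 5/12
```

Dividing $U$ by $\mu(\Omega)$ and integrating against $P$ give the same number, as the proof asserts.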



Observe that if $F_1$ and $F_2 \in \mathcal{F}$ differ only on a measurable set $A$ on which $u(F_1(\omega)) \le u(F_2(\omega))$, and $D_1 = u \circ F_1$ and $D_2 = u \circ F_2$, then $D_1 \le D_2$ and

$$U(D_2) - U(D_1) = \int_\Omega D_2(\omega)\,dP(\omega) - \int_\Omega D_1(\omega)\,dP(\omega) = \int_A [u(F_2(\omega)) - u(F_1(\omega))]\,dP(\omega) \ge 0.$$

Thus the two utility functions, $u$ and $U$, must be monotonically related. We state this "monotone property" as

Corollary 2: If $F_1$ and $F_2 \in \mathcal{F}$ differ only on a measurable set $A$ on which $u(F_1(\omega)) \le u(F_2(\omega))$, then $U(u \circ F_1) \le U(u \circ F_2)$.

We also note that if $\Omega$ is compact, $\mathcal{F}$ contains all constant functions from $\Omega$ to $\mathcal{R}^*$. Consider the function $F_M \in \mathcal{F}$ such that $F_M(\omega) \equiv r_M$, where $r_M$ is the most desirable prize in $\mathcal{R}^*$. Now if $F$ is any other function, differing from $F_M$ only on a measurable set $A$, we have

$$u(F(\omega)) \le u(F_M(\omega)) \qquad (\forall \omega \in \Omega)$$

and, by the monotone property,

$$U(u \circ F) \le U(u \circ F_M).$$

That is, the decision function which yields the prize with the greatest utility for all outcomes in $\Omega$ must have maximum $U$ utility. Thus, for $\Omega$ compact, we may normalize the $U$-utility function so that $U(u \circ F_M) = 1$. Then,

$$1 = U(u \circ F_M) = \int_\Omega u \circ F_M(\omega)\,d\mu(\omega) = \int_\Omega d\mu(\omega) = \mu(\Omega),$$

and no further normalization of the measure $\mu$ is required to guarantee that $\mu$ is a probability measure. The significance of this observation is that, for $\Omega$ compact, $U$ may be normalized directly (before application of Theorem 1) in terms of its value at one point in $\mathcal{D}$.

3. Construction of the Probability Measure

A useful by-product of our approach to showing the existence of the probability measure $P$ is that the Riesz Representation Theorem also shows how the measure $\mu$ is constructed. Following Rudin, define "$D \prec V$" to mean

a) $V$ is an open subset of $\Omega$

b) $D \in \mathcal{D}_e$

c) $0 \le D(\omega) \le 1$ for all $\omega \in \Omega$

d) The support of $D$ lies in $V$.

For each open set $V \subset \Omega$, the proof of the Riesz Representation Theorem shows that

$$\mu(V) = \sup\{U_e(D) : D \prec V\}.$$

Further, for any $E \in M$, define

$$\mu(E) = \inf\{\mu(V) : E \subset V,\ V \text{ open}\}.$$

In our application, the set of all $D \in \mathcal{D}_e$ such that $0 \le D(\omega) \le 1$ for all $\omega \in \Omega$ is exactly the set $\mathcal{D}$. Also, for $D \in \mathcal{D}$, $U_e(D) = U(D)$ so we have

Corollary 3: For each open set $V \subset \Omega$,

$$\mu(V) = \sup\{U(D) : S_D \subset V\},$$

where $S_D$ is the support of $D$. For any $E \in M$,

$$\mu(E) = \inf\{\mu(V) : E \subset V,\ V \text{ open}\}.$$

This is important because it allows us to determine $\mu$ on $M$ without extending $\mathcal{D}$ to $\mathcal{D}_e$ or $U$ to the linear functional $U_e$. For any $A \in M$, define $P(A) = \mu(A)/\mu(\Omega)$. Then $P: M \to [0,1]$ is the probability measure determined by the decision maker.
Corollary 4: With $\Omega$ countable and the discrete topology, $\mu(V) = U(D_V)$ for every $V \subset \Omega$, where $D_V$ is the indicator function of the set $V$; i.e.,

$$D_V(\omega) = \begin{cases} 1 & \text{if } \omega \in V \\ 0 & \text{otherwise.} \end{cases}$$

Proof: With the discrete topology over $\Omega$, $D_V \in \mathcal{D}$ (every function from $\Omega$ into $[0,1]$ is continuous). Let $D$ be any element of $\mathcal{D}$ such that $S_D$ (the support of $D$) is contained in $V$. Then, by Corollary 2, $U(D) \le U(D_V)$. (Indeed, by direct computation,

$$U(D) = \int_\Omega D(\omega)\,d\mu(\omega) = \int_V D(\omega)\,d\mu(\omega) \le \int_V 1\,d\mu(\omega) = \int_\Omega D_V(\omega)\,d\mu(\omega) = U(D_V).)$$

By Corollary 3,

$$\mu(V) = \sup\{U(D) : S_D \subset V\} = U(D_V).$$

[]
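The content of Corollaries 3 and 4 can be seen concretely on a small discrete space: among functions supported in $V$, the indicator $D_V$ attains the supremum of $U$. The sketch below assumes a three-point space with hypothetical point masses and searches a coarse grid of candidate functions; it illustrates the statement rather than proving it.

```python
from fractions import Fraction
from itertools import product

# Hypothetical point masses of the representing measure on Omega = {1, 2, 3}.
mu = {1: Fraction(1, 2), 2: Fraction(1, 3), 3: Fraction(1, 6)}

def U(D):
    """Utility of D as an integral against mu; D is zero off its listed points."""
    return sum(D.get(w, 0) * m for w, m in mu.items())

V = {1, 3}
D_V = {w: Fraction(1) for w in V}          # indicator function of V

# Candidate functions with support inside V, taking values on a coarse grid.
levels = [Fraction(0), Fraction(1, 2), Fraction(1)]
candidates = [dict(zip(sorted(V), vals)) for vals in product(levels, repeat=len(V))]

best = max(U(D) for D in candidates)
print(best, U(D_V))  # both equal 2/3, which is mu(V)
```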



Remark: Suppose $B$ is a basis for a vector space. The assignment of values of a linear functional over a basis for its domain completely determines the functional. Hence, the assignment of utilities over a basis set will, under conditions of Theorem 1, determine the induced probability measure. Indeed, as is seen in the examples of the next section, the probability measure is determined by a $U$ assignment to a class of functions generating $\mathcal{D}$.
4. Numerical Examples

We now consider three examples of the construction of the induced probability measure. One example concerns a simple case where $\Omega$ is finite, another involves a problem where $\Omega$ is denumerable, and the last determines a probability measure for a continuous sample space $\Omega$.

For the case where $\Omega$ is finite, we use a horse race example, which is the setting for the pioneering paper by Anscombe and Aumann [1]. The example serves to provide insight into the construction of the subjective probability measure and, also, to show the correspondence between the notation of Anscombe and Aumann and that of this paper.

Example 1: Let $\Omega = \{h_1, h_2, \ldots, h_5\}$ be the set of five horses running in a given race. We assume that the decision maker is given the chance to observe the odds (prizes) from the totalizator board as determined by the parimutuel betting. Suppose that the odds are as follows:

    Horse   h_1      h_2      h_3      h_4       h_5
    Odds    1 to 1   3 to 1   7 to 1   11 to 1   23 to 1

The decision maker has the option of betting on any of the five horses. Without loss of generality we assume that the decision maker has a total of \$1 to wager. Consequently, the set of prizes is

$$\mathcal{R} = \{r_0 = -1,\ r_1 = 1,\ r_2 = 3,\ r_3 = 7,\ r_4 = 11,\ r_5 = 23\}.$$

(He can bet on $h_1$ and either win \$1 or lose \$1, or he can bet on $h_2$ and either win \$3 or lose \$1, etc.) Let $\mathcal{R}^*$ be the set of probability distributions over $\mathcal{R}$, and let $r_j^*$ be that distribution in $\mathcal{R}^*$ degenerate at $r_j$. We take the decision maker's utility for the distributions $r_j^*$ to be the identity function normalized so that all utilities lie between 0 and 1. This gives

    r*      r_0*   r_1*   r_2*   r_3*   r_4*   r_5*
    u(r*)   0      1/12   1/6    1/3    1/2    1

Let $\mathcal{F}$ be the set of all decision functions mapping $\Omega$ to $\mathcal{R}^*$ (the class of "lotteries" over $\mathcal{R}^*$) which differ in the prizes received as a result of the outcome of the horse race. Define the lotteries $F_j: \Omega \to \mathcal{R}^*$ as follows:

$$F_j(\omega) = \begin{cases} r_j^* & \text{if } \omega = h_j \\ r_0^* & \text{otherwise} \end{cases}$$

for $j = 1, 2, \ldots, 5$.

Let $\mathcal{D} = \{u \circ F : F \in \mathcal{F}\}$ and represent $u \circ F_j$ by $D_j$, so

$$D_j(\omega) = \begin{cases} u(r_j^*) & \text{if } \omega = h_j \\ 0 & \text{otherwise.} \end{cases}$$

Taking the discrete topology over $\Omega$ (the set of all subsets of $\Omega$) trivially ensures that the conditions of Theorem 1 are satisfied. We suppose the decision maker has expressed the following utilities for the gambles $\{D_1, D_2, D_3, D_4, D_5\}$:
    D       D_1    D_2    D_3     D_4    D_5
    U(D)    1/32   1/16   5/144   1/24   1/16

Now define the lotteries

$$D'_j(\omega) = \begin{cases} 1 & \text{if } \omega = h_j \\ 0 & \text{otherwise.} \end{cases}$$

By Corollary 4, $p_j = P[h_j \text{ wins}] = U(D'_j)$. Since $U$ is linear and

$$D'_j = \frac{D_j}{u(r_j^*)},$$

we have

$$p_j = U(D'_j) = \frac{U(D_j)}{u(r_j^*)}.$$

The mass function induced over $\Omega$ is therefore as shown in the table below.

    h            h_1   h_2   h_3    h_4    h_5
    P[h wins]    3/8   3/8   5/48   1/12   1/16
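The arithmetic of Example 1 can be reproduced with exact fractions. The sketch below recomputes the normalized prize utilities from the odds, divides the stated lottery utilities by them, and then uses the resulting mass function to price one further lottery; the helper name `U_of` and the extra lottery are illustrative assumptions.

```python
from fractions import Fraction as Fr

# Winnings on a $1 bet for horses h_1..h_5, taken from the odds table.
winnings = {1: 1, 2: 3, 3: 7, 4: 11, 5: 23}
worst, best = -1, 23          # r_0 = -1 (lose the stake), r_5 = 23

# Identity utility rescaled to [0, 1]: u(r) = (r - worst) / (best - worst).
u = {j: Fr(r - worst, best - worst) for j, r in winnings.items()}

# Utilities the decision maker states for the basis lotteries D_1..D_5.
U_D = {1: Fr(1, 32), 2: Fr(1, 16), 3: Fr(5, 144), 4: Fr(1, 24), 5: Fr(1, 16)}

# Subjective probabilities: p_j = U(D_j) / u(r_j*).
p = {j: U_D[j] / u[j] for j in winnings}
print(p)  # 3/8, 3/8, 5/48, 1/12, 1/16

def U_of(D):
    """Utility of an arbitrary lottery D (u-utility paid when h_j wins)."""
    return sum(D[j] * p[j] for j in p)

# Hypothetical extra lottery: u-utility 1/2 whichever horse wins.
flat = {j: Fr(1, 2) for j in p}
print(U_of(flat))  # 1/2
```

The probabilities sum to one without further normalization, which is consistent with the utilities in the table having already been scaled.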



Observe that we were able to determine the decision maker's probabilities from knowledge of his utilities of only the five lotteries $D_1$, $D_2$, $D_3$, $D_4$ and $D_5$. This is because these lotteries form a basis for all lotteries; that is, each $D \in \mathcal{D}$ can be written as a linear combination of the "basis lotteries". This is an important point since the decision maker is not required to state his utility for each lottery in $\mathcal{D}$, but rather he need only make assignments to those in the basis. Once he states his utilities for the lotteries in a basis, we can determine his subjective probability distribution over the outcomes of the horse race, and in turn calculate the utility of any other lottery in $\mathcal{D}$ using the expectation property of the utility function. This relieves the decision maker of having to evaluate complicated lotteries, between which he may be uncertain in his preferences, so as to yield values consistent with his more strongly held preferences.



Example 2: Let $\Omega$ be the set of positive integers $N$. We interpret the outcome $n \in \Omega$ as the number of years hence until a total cure for a particular type of cancer is discovered. Let $\mathcal{R} = \{r_0, r_1\}$ be a class of prizes where $r_0$ and $r_1$ are the prizes "no help" and "total cure", respectively, and let $\mathcal{R}^*$ be the class of probability distributions over $\mathcal{R}$. Interpret $r_0^*$ and $r_1^*$ as the distributions in $\mathcal{R}^*$ degenerate at $r_0$ and $r_1$ and, for $0 < \alpha < 1$, $r_\alpha^*$ is that distribution which gives prize $r_1$ with probability $\alpha$ and prize $r_0$ with probability $1 - \alpha$. We interpret the prize $r_\alpha^*$ as some progress somewhat short of a total cure, perhaps a treatment which reduces pain or which increases the patient's lifetime. We suppose that the decision maker has utility $\alpha$ for prize $r_\alpha^*$ $(0 \le \alpha \le 1)$; that is,

$$u(r_\alpha^*) = \alpha.$$

Let $\mathcal{D}$ be the class of functions from $\Omega$ into $[0,1]$ (as before, we assume the discrete topology for $\Omega$ so that all functions are continuous with compact support). In particular, let

$$D_n(\omega) = \begin{cases} 1 & \text{if } \omega = n \\ 0 & \text{otherwise.} \end{cases}$$

The function $D_n$ corresponds to the case where no progress is made in the first $n - 1$ years and a total cure is found in the $n$th year. We suppose the decision maker assigns utilities to the functions $D_n$, $n \in N$, as follows:

$$U(D_n) = k\rho^n, \qquad 0 < \rho < 1,$$

where $k$ is some proportionality constant.

Now let $p(n)$ be the probability that exactly $n - 1$ years pass before a total cure is discovered. Then, by Corollary 1, and the definition of $D_n$,

$$U(D_n) = k\rho^n = \sum_{\omega=1}^{\infty} D_n(\omega)\,p(\omega) = p(n).$$

Since $\sum_{\omega=1}^{\infty} p(\omega) = 1$, we find that $k = (1-\rho)/\rho$ and $p(n) = (1-\rho)\rho^{n-1}$. Thus, the decision maker's subjective probability distribution for the number of years that will elapse before a cure is found is geometric. He is therefore implicitly viewing the discovery of a cure in a given year as a Bernoulli trial with probability of success $1 - \rho$.

One could argue that the probability of success should not be constant from year to year, but rather an increasing function of $n$ (as more knowledge is gained, the probability of success increases). Thus, the decision maker might want to re-evaluate his utilities when presented with his induced distribution. If he is content with the disclosure of his induced distribution, he can utilize this information to calculate utilities of more complex alternatives. For example, his utility of an arbitrary alternative $D \in \mathcal{D}$ is found to be

$$U(D) = \sum_{\omega=1}^{\infty} D(\omega)\,p(\omega) = \sum_{\omega=1}^{\infty} D(\omega)(1-\rho)\rho^{\omega-1}.$$
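The normalization in Example 2 is easy to verify with exact arithmetic. The following sketch assumes an illustrative value for the decay parameter (written `rho` here) and a hypothetical alternative, "cure within m years", whose utility follows from the expectation formula.

```python
from fractions import Fraction as Fr

rho = Fr(3, 4)                 # assumed value of the decay parameter, 0 < rho < 1
k = (1 - rho) / rho            # constant forced by sum_n k * rho**n = 1

def p(n):
    """Induced probability that the cure arrives in year n: k * rho**n."""
    return k * rho**n

def U_cure_within(m):
    """Utility of the alternative D(w) = 1 for w <= m, 0 otherwise."""
    return sum(p(n) for n in range(1, m + 1))

print(p(1))                    # 1/4, which equals 1 - rho
print(p(2) / p(1))             # 3/4, the constant ratio of a geometric law
print(U_cure_within(2))        # 7/16, which equals 1 - rho**2

# Partial sums approach 1: the first 200 terms miss only rho**200.
partial = sum(p(n) for n in range(1, 201))
```

The yearly "success" probability implied by this distribution is `p(1) = 1 - rho`.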

Example 3: Consider a ship maneuvering about in open sea in the presence of an enemy mine. The ship is equipped with a device which enables it to search the sea in a circular neighborhood for the location of the enemy mine. The success of the ship in locating the mine depends on the characteristics of the search device (as well as sea state, electromagnetic noise, etc.). We suppose that the ship is interested in maneuvering about within a radius of one unit from its present position.

Let $\mathcal{D}$ be the class of continuous functions from the unit disc into the interval $[0,1]$. Using polar coordinates to describe points in the unit disc (ship position is taken as the origin), one can interpret $D(r,\theta)$, $D \in \mathcal{D}$, as a measure of the capability of the search device to detect a mine at the point $(r,\theta)$. We will assume that the capabilities of the search device are independent of the bearing of the mine. Thus, we interpret $D(r,\theta)$ as the probability that a mine at a distance of $r$ units from the ship will be detected. In particular, let $D_s^\varepsilon$ be a function in $\mathcal{D}$ which is one inside the circle of radius $s$ and 0 outside the circle of radius $s + \varepsilon$. This is essentially a "cookie-cutter" prize function. With our interpretation $D_s^\varepsilon$ represents a search device which is perfect for detecting a mine inside a circle of radius $s$ and is useless for detecting a mine beyond $s + \varepsilon$ units.

The ship's Captain is asked to give his utilities for the search devices $D_s^\varepsilon$, $0 \le s \le 1$. Suppose that his utility of the device $D_s^\varepsilon$ is proportional to $s^2 + o(\varepsilon)$; that is, $U(D_s^\varepsilon) = k \cdot (s^2 + o(\varepsilon))$ for some proportionality constant $k$. With this utility assignment, the Captain's utility for a search device which gives perfect information for a radius of $s$ is proportional to the area of the circle of perfect information.

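Now for any $s \in [0,1]$, Corollary 1 gives

$$U(D_s^\varepsilon) = \int_{\theta=0}^{2\pi} \int_{r=0}^{1} D_s^\varepsilon(r,\theta)\, f(r,\theta)\, r\,dr\,d\theta = k\,(s^2 + o(\varepsilon)),$$

where $f(r,\theta)$ is the bivariate probability density function induced by the utility function. Taking limits as $\varepsilon \to 0$ gives

$$\lim_{\varepsilon \to 0} U(D_s^\varepsilon) = ks^2 = \lim_{\varepsilon \to 0} \int_0^{2\pi} \int_0^1 D_s^\varepsilon(r,\theta)\, f(r,\theta)\, r\,dr\,d\theta = \int_0^{2\pi} \int_0^s f(r,\theta)\, r\,dr\,d\theta.$$

Since the capabilities of the search device are independent of the bearing of the mine, the density function $f(r,\theta)$ is constant with respect to $\theta$; that is, $f(r,\theta) = f_R(r)$ and

$$ks^2 = \int_0^{2\pi} \int_0^s r f(r,\theta)\,dr\,d\theta = 2\pi \int_0^s r f_R(r)\,dr.$$

Now, by differentiation,

$$2ks = 2\pi s f_R(s)$$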

which implies that $f_R(s) = k/\pi$. Since $f$ is a probability density over the unit disc, $\int_0^{2\pi} \int_0^1 f_R(r)\, r\,dr\,d\theta = k = 1$, so that $f(r,\theta) = 1/\pi$. Thus, the Captain's subjective bivariate density of the location of the mine is uniform over the unit disc, and the induced density of the distance $r$ to the mine is

$$2\pi r f_R(r) = \begin{cases} 2r & 0 \le r \le 1 \\ 0 & \text{otherwise.} \end{cases}$$
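A quick numerical check of Example 3 is to integrate a density that is uniform over the unit disc against the limiting "perfect within radius $s$" device and confirm that the resulting utility is proportional to $s^2$. The sketch below assumes the density is expressed with respect to the area element $r\,dr\,d\theta$ (so the uniform density is $1/\pi$) and uses a simple midpoint rule; the function name and grid sizes are illustrative choices.

```python
import math

def utility_of_perfect_search(s, f=lambda r, theta: 1.0 / math.pi,
                              n_r=200, n_theta=200):
    """Midpoint-rule value of the double integral of f(r, theta) * r
    over 0 <= r <= s, 0 <= theta <= 2*pi."""
    dr = s / n_r
    dtheta = 2.0 * math.pi / n_theta
    total = 0.0
    for i in range(n_r):
        r = (i + 0.5) * dr
        for j in range(n_theta):
            theta = (j + 0.5) * dtheta
            total += f(r, theta) * r * dr * dtheta
    return total

for s in (0.25, 0.5, 1.0):
    # Compare the numerical utility with s**2.
    print(s, utility_of_perfect_search(s), s**2)
```

At $s = 1$ the integral is the total probability of the disc, so the utilities under a uniform density are exactly $s^2$ up to rounding.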

5. Applications to Decision Theory

The preceding general structure can be specialized to cover situations seemingly different from the examples discussed above. Here we consider statistical decision problems of the form $(\Theta, \mathcal{D}, \rho)$ in Ferguson's notation.

Example 4: Suppose the decision maker is confronted with two outwardly indistinguishable coins, one (say $\theta_1$) with probability 1/2 of heads and the other ($\theta_2$) with probability 1/3 of heads. He is asked to select one coin and is allowed to observe the outcome of one toss of it. The prizes he can get depend upon a second toss of the coin, as described below. Let $T$ denote the value of $\theta \in \Theta = \{\theta_1, \theta_2\}$ and $x \in \mathcal{X}$ the value observed on the second toss of the chosen coin. Let the class of prizes $\mathcal{R}$ be $[0,1]$ and $u$ the identity function. Each function $\delta = u \circ F$ from $\Theta \times \mathcal{X}$ into $[0,1]$ can be given as a four-tuple $(a,b,c,d)$, where $a$ is the $u$-utility of the prize won if $(T,X)(\omega) = (\theta_1, h)$; similarly, $b$ is won if $(\theta_1, t)$ occurs, $c$ if $(\theta_2, h)$ and $d$ if $(\theta_2, t)$. The class of continuous functions from $\Theta \times \mathcal{X}$ into $[0,1]$ with compact support is thus identified with $\mathcal{D} = \times_{i=1}^{4} [0,1]$. Suppose the decision maker, having observed the outcome $X_1(\omega) = h$ on the selected coin, assigns $U$-utilities to the extreme points (of $\mathcal{D}$) $e_j = (\delta_{1j}, \delta_{2j}, \delta_{3j}, \delta_{4j})$ proportional to $j$ (where $\delta_{ij}$ is the Kronecker delta).
In order to make the measure $\mu$ of Theorem 1 a probability
measure it suffices to set $U(1,1,1,1) = 1$, so $\mu\{(\theta_1,h)\} = 1/10$,
$\mu\{(\theta_1,t)\} = 2/10$, $\mu\{(\theta_2,h)\} = 3/10$,
$\mu\{(\theta_2,t)\} = 4/10$. This is the conditional distribution of $(T,X)$,
given $X_1(\omega) = h$, from which the conditional distribution
$\nu(\cdot \mid h)$ of $T$ given $X_1(\omega) = h$ is
$\nu(\theta_1 \mid h) = \sum_x \mu\{(\theta_1,x)\} = 3/10$;
$\nu(\theta_2 \mid h) = 7/10$. The marginal distribution $\tau$ of $T$ has
value at $\theta$ proportional to $\nu(\theta \mid h)/f(h \mid \theta)$, where
$f(h \mid \theta)$ is the conditional distribution of $X_1$ given
$T(\omega) = \theta$, evaluated at $X_1(\omega) = h$. Thus, since these masses
summed over $\theta$ must be unity, we have

$$1 = \sum_{\theta \in \Theta} k \, \frac{\nu(\theta \mid h)}{f(h \mid \theta)}
    = k \left[ \frac{3/10}{1/2} + \frac{7/10}{1/3} \right],$$

from which

$$\tau(\theta_1) = 2/9; \qquad \tau(\theta_2) = 7/9.$$

Similarly, the marginal distribution $\gamma$ of $X$ is

$$\gamma(h) = P[X = h] = \frac{1}{2} \cdot \frac{2}{9} + \frac{1}{3} \cdot \frac{7}{9}
    = \frac{10}{27}; \qquad \gamma(t) = \frac{17}{27}.$$
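
The arithmetic of this example can be checked with exact fractions. The following is a minimal sketch, not part of the original paper; the names `outcomes`, `mu`, `nu`, `f_h`, `tau` and `gamma_h` are illustrative choices, and the likelihoods 1/2 and 1/3 are the heads probabilities of the two coins.

```python
from fractions import Fraction as Fr

# Outcomes (theta, x) in the order (theta1,h), (theta1,t), (theta2,h), (theta2,t).
outcomes = [("theta1", "h"), ("theta1", "t"), ("theta2", "h"), ("theta2", "t")]

# U-utilities of the extreme points e_j are proportional to j; dividing by
# their sum enforces U(1,1,1,1) = 1.
weights = [Fr(j) for j in range(1, 5)]
mu = {w: wt / sum(weights) for w, wt in zip(outcomes, weights)}

# Conditional distribution nu(. | h) of T: sum mu over the second toss x.
nu = {}
for (theta, _x), mass in mu.items():
    nu[theta] = nu.get(theta, Fr(0)) + mass

# Likelihood of heads on the first toss for each coin.
f_h = {"theta1": Fr(1, 2), "theta2": Fr(1, 3)}

# Marginal tau(theta) is proportional to nu(theta | h) / f(h | theta).
unnormalized = {th: nu[th] / f_h[th] for th in nu}
k = 1 / sum(unnormalized.values())
tau = {th: k * v for th, v in unnormalized.items()}

# Marginal probability of heads on a toss of the chosen coin.
gamma_h = sum(tau[th] * f_h[th] for th in tau)

print(tau, gamma_h)
```

Working in `Fraction` rather than floating point keeps every intermediate value comparable with the fractions quoted in the text.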
The Bayes profit of a strategy $\delta = (a,b,a,b)$, which depends only upon
the second toss $X$ of the coin, is, given $X_1(\omega) = h$,
$a \cdot \frac{4}{10} + b \cdot \frac{6}{10}$. If the outcome $X_1(\omega) = h$
is ignored (or if a coin is again chosen from $\Theta$), the Bayes profit is
$\left[\frac{a}{2} + \frac{b}{2}\right] \cdot \frac{2}{9}
+ \left[\frac{a}{3} + \frac{2b}{3}\right] \cdot \frac{7}{9}
= a\gamma(h) + b\gamma(t)$. On the other hand, the Bayes profit of a strategy
$\delta = (a,a,b,b)$ depending only upon nature's choice $T$ of a parameter in
$\Theta$ is, given $X_1(\omega) = h$,
$a \cdot \frac{3}{10} + b \cdot \frac{7}{10}$, whereas unconditionally it is
$a \cdot \frac{2}{9} + b \cdot \frac{7}{9}$.
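
The four Bayes profits compared here are expectations of a four-tuple under two different measures on the outcomes. The sketch below is an illustration under stated assumptions: the conditional profit is taken to be the expectation under the measure with masses 1/10, 2/10, 3/10, 4/10, the unconditional one uses the product of the marginal (2/9, 7/9) with the coins' heads probabilities, and the helper `profit_coefficients` is a hypothetical name. It returns the coefficients of $a$ and $b$ rather than a number.

```python
from fractions import Fraction as Fr

def profit_coefficients(measure, strategy):
    """Return {symbol: coefficient} for the expected payoff of `strategy`.

    `measure` lists masses in the order (theta1,h), (theta1,t), (theta2,h),
    (theta2,t); `strategy` lists the payoff symbol won at each outcome.
    """
    coeffs = {}
    for mass, symbol in zip(measure, strategy):
        coeffs[symbol] = coeffs.get(symbol, Fr(0)) + mass
    return coeffs

# Conditional measure given that the first toss showed heads.
mu = [Fr(1, 10), Fr(2, 10), Fr(3, 10), Fr(4, 10)]

# Unconditional joint measure tau(theta) * f(x | theta).
tau = {"theta1": Fr(2, 9), "theta2": Fr(7, 9)}
p_heads = {"theta1": Fr(1, 2), "theta2": Fr(1, 3)}
joint = []
for th in ("theta1", "theta2"):
    joint += [tau[th] * p_heads[th], tau[th] * (1 - p_heads[th])]

second_toss_only = ("a", "b", "a", "b")   # payoff depends only on X
coin_only = ("a", "a", "b", "b")          # payoff depends only on T

cond_x = profit_coefficients(mu, second_toss_only)
uncond_x = profit_coefficients(joint, second_toss_only)
cond_t = profit_coefficients(mu, coin_only)
uncond_t = profit_coefficients(joint, coin_only)

print(cond_x, uncond_x, cond_t, uncond_t)
```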

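Example 5: We consider next a simple problem $(\Theta, A, L)$, based upon an
example discussed by Chernoff and Moses [2]. Suppose nature's choice
for today's weather is made from
$\Theta = \{\theta_1 \text{ (rain)}, \theta_2 \text{ (shine)}\}$. We must
decide whether to take a raincoat ($a_1$) or not ($a_2$). Suppose the
loss structure is as follows:

$$\begin{array}{c|cc}
L(\theta, a) & a_1 & a_2 \\ \hline
\theta_1 & 1/3 & 1 \\
\theta_2 & 2/3 & 0
\end{array}$$

Let $\Omega = \Theta$, $R = \Theta \times A$, and let $u$ be $1 - L$ on
$\Theta \times A$, extended by expectation from its values on
$\Theta \times A$ to $(\Theta \times A)^*$:

$$\begin{array}{c|cc}
u(\theta, a) & a_1 & a_2 \\ \hline
\theta_1 & 2/3 & 0 \\
\theta_2 & 1/3 & 1
\end{array}$$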

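Consider two rules $\delta_1$ and $\delta_2$ in $V$ as follows:

$$\delta_1(\theta) = \begin{cases} 1 & \text{if } \theta = \theta_1 \\
0 & \text{if } \theta = \theta_2 \end{cases}; \qquad
\delta_2(\theta) = \begin{cases} 1 & \text{if } \theta = \theta_2 \\
0 & \text{if } \theta = \theta_1. \end{cases}$$

Suppose the decision maker assigns $U(\delta_1) = 2/3$, $U(\delta_2) = 1/3$.
Then by Corollary 1,

$$\frac{1}{3} = U(\delta_2) = \int \delta_2(\theta) \, d\tau(\theta)
    = \tau(\{\theta_2\});$$

similarly $\tau(\{\theta_1\}) = 2/3$. Since $\tau(\Theta) = 1$, no further
normalization is needed and the decision maker's subjective prior of rain
is 2/3. The U-utility of any rule

$$\delta(\theta) = \begin{cases} \alpha & \text{if } \theta = \theta_1 \\
\beta & \text{if } \theta = \theta_2 \end{cases}$$

is $U(\delta) = 2\alpha/3 + \beta/3$, the Bayes utility of $\delta$.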
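
On a two-point parameter space the integral in Corollary 1 is a two-term sum, so the prior can be read off directly from the utilities assigned to the two indicator rules. A minimal sketch follows; the function name `U` and the values chosen for `alpha` and `beta` are illustrative assumptions, not taken from the paper.

```python
from fractions import Fraction as Fr

thetas = ("theta1", "theta2")  # rain, shine

# Indicator rules of Example 5 and the U-utilities assigned to them.
delta1 = {"theta1": Fr(1), "theta2": Fr(0)}
delta2 = {"theta1": Fr(0), "theta2": Fr(1)}
assessed_U = {"delta1": Fr(2, 3), "delta2": Fr(1, 3)}

# For an indicator rule, U(delta) = sum_theta delta(theta) * tau(theta)
# reduces to the prior mass of the indicated state.
tau = {"theta1": assessed_U["delta1"], "theta2": assessed_U["delta2"]}

def U(delta, prior=tau):
    """Bayes utility of a rule: its expectation under the prior."""
    return sum(delta[th] * prior[th] for th in thetas)

# A general rule taking the value alpha at theta1 and beta at theta2.
alpha, beta = Fr(1, 2), Fr(1, 4)  # arbitrary illustrative values
general = {"theta1": alpha, "theta2": beta}

print(U(delta1), U(delta2), U(general))
```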



Example 6: As in the horse racing example, with the problem described in
Example 5 it might have been easier for the decision maker to assign
U-utilities initially to rules $\delta$ which correspond to $u \circ F$ with
$F(\theta_i)$ a distribution assigning unit mass over points in
$\{\theta_i\} \times A$. This is because if he knew nature's choice were
$\theta_i$ he would want to consider only those prizes of the form
$(\theta_i, a_j)$. Thus it might be relatively easy for him to "value", in
the U-utility sense, $\delta$'s associated with $F$'s mapping $\theta_i$ into
$(\{\theta_i\} \times A)^*$, $i = 1, 2$. For example, if it were known
that $\theta_2$ (shine) was nature's choice, the most desirable rule
corresponds to taking action $a_2$ (no coat), which in turn might reasonably
correspond to the rule $u \circ F_2$, where $F_2(\theta_1)$ is degenerate at
$(\theta_1, a_2)$ and $F_2(\theta_2)$ is degenerate at $(\theta_2, a_2)$. But
this is precisely the rule $\delta_2$ of Example 5. Similarly, if one knew
$\theta_1$ were chosen, the rule $\delta_3 = u \circ F_3$ in which
$F_3(\theta_1)$ is degenerate at $(\theta_1, a_1)$ and $F_3(\theta_2)$ is
degenerate at $(\theta_2, a_1)$ is most desirable. Clearly,

$$\delta_3(\theta) = \begin{cases} 2/3 & \text{if } \theta = \theta_1 \\
1/3 & \text{if } \theta = \theta_2. \end{cases}$$

Imagine the decision maker assigns $U(\delta_3) = 5/9$ and
$U(\delta_2) = 1/3$ (as before). It follows that the subjective prior is the
solution $\tau(\theta_1)$, $\tau(\theta_2)$ to the system

$$U(\delta_3) = 5/9 = \tfrac{2}{3} \, \tau(\theta_1) + \tfrac{1}{3} \, \tau(\theta_2)$$

$$U(\delta_2) = 1/3 = 0 \cdot \tau(\theta_1) + 1 \cdot \tau(\theta_2),$$

giving $\tau(\theta_1) = 2/3$, $\tau(\theta_2) = 1/3$ (normalization is again
unnecessary in this case).
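
Extracting the prior from utilities assigned to two arbitrary rules amounts to solving a two-by-two linear system. The sketch below solves the system of this example exactly by Cramer's rule; `solve_2x2` is a hypothetical helper written for the illustration, and it raises an error when the two assessed rules are proportional and so cannot determine the prior.

```python
from fractions import Fraction as Fr

def solve_2x2(rows, rhs):
    """Solve a 2x2 linear system exactly by Cramer's rule."""
    (a, b), (c, d) = rows
    det = a * d - b * c
    if det == 0:
        raise ValueError("the assessed rules do not determine the prior")
    x = (rhs[0] * d - b * rhs[1]) / det
    y = (a * rhs[1] - rhs[0] * c) / det
    return x, y

# Each row holds (delta(theta1), delta(theta2)): first delta_3, then delta_2.
rules = [(Fr(2, 3), Fr(1, 3)), (Fr(0), Fr(1))]
utilities = [Fr(5, 9), Fr(1, 3)]

tau1, tau2 = solve_2x2(rules, utilities)
print(tau1, tau2)
```

If the solution did not sum to one, dividing both components by their sum would supply the normalization that the text notes is unnecessary here.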

Example 7: Consider next the game described above in which we can
observe the outcome on a random variable $X$ (weather forecast) with
sample space $\mathcal{X} = \{x_1 = \text{rain}, x_2 = \text{shine}\}$. Suppose
it is known that $P[X = x_1 \mid \theta_1] = 3/4$ and
$P[X = x_1 \mid \theta_2] = 1/5$. We now have the statistical decision problem
$(\Theta, D, \rho)$. Let $\Omega = \Theta \times \mathcal{X}$,
$R = \Theta \times A$ and $u = 1 - L$ extended by expectation to
$(\Theta \times A)^*$, as before. Let $\omega_1 = (\theta_1, x_1)$,
$\omega_2 = (\theta_1, x_2)$, $\omega_3 = (\theta_2, x_1)$ and
$\omega_4 = (\theta_2, x_2)$. Members $F$ of $\mathcal{F}$ can be represented
as vectors $(P_1, P_2, P_3, P_4)$ in which $P_i = F(\omega_i)$ is a
mass function over $\Theta \times A$. For the moment, restrict attention to
$F((\theta_i, x_j))$ that allocate their total mass to points in
$\{\theta_i\} \times A$. In this case, $F$ maps points in
$\{\theta_i\} \times \mathcal{X}$ to distributions over $\{\theta_i\} \times A$,
so we may regard $F$ as a mapping from $\mathcal{X}$ to $A^*$, that is, a
behavioral decision rule. For such $F$, the corresponding $\delta$
(represented by $(d_1, d_2, d_3, d_4)$ where $\delta(\omega_i) = d_i$) may have
first two components in $[0, 2/3]$ and second two components in $[1/3, 1]$.
Example rules in $V$ are

the least desirable: $\delta_0 = (0, 0, 1/3, 1/3)$
take action $a_1$: $\delta_1 = (2/3, 2/3, 1/3, 1/3)$
take action $a_2$: $\delta_2 = (0, 0, 1, 1)$
follow the forecast ($a_i \equiv x_i$): $\delta_3 = (2/3, 0, 1/3, 1)$
contradict the forecast ($a_i \not\equiv x_i$): $\delta_4 = (0, 2/3, 1, 1/3)$
knowledge of nature's choice ($a_i \equiv \theta_i$): $\delta_5 = (2/3, 2/3, 1, 1)$.
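
Each utility vector in this list follows mechanically from the table of $u = 1 - L$ in Example 5 once a rule is written as a map from forecast to action. The sketch below regenerates the vectors; the function name `delta_from_rule` and the dictionary encoding of rules are assumptions made for the illustration.

```python
from fractions import Fraction as Fr

# u = 1 - L on Theta x A, from the table of Example 5.
u = {
    ("theta1", "a1"): Fr(2, 3), ("theta1", "a2"): Fr(0),
    ("theta2", "a1"): Fr(1, 3), ("theta2", "a2"): Fr(1),
}

# omega_1..omega_4 in the order used in the text.
omegas = [("theta1", "x1"), ("theta1", "x2"), ("theta2", "x1"), ("theta2", "x2")]

def delta_from_rule(rule):
    """Utility vector (d_1..d_4) of a nonrandomized forecast-to-action rule."""
    return tuple(u[(theta, rule[x])] for theta, x in omegas)

take_a1 = delta_from_rule({"x1": "a1", "x2": "a1"})
take_a2 = delta_from_rule({"x1": "a2", "x2": "a2"})
follow = delta_from_rule({"x1": "a1", "x2": "a2"})
contradict = delta_from_rule({"x1": "a2", "x2": "a1"})

# The two benchmark vectors depend on theta, so they are not rules on the
# forecast alone: the worst and best achievable utility in each state.
actions = ("a1", "a2")
least_desirable = tuple(min(u[(th, a)] for a in actions) for th, _ in omegas)
knows_nature = tuple(max(u[(th, a)] for a in actions) for th, _ in omegas)

print(take_a1, take_a2, follow, contradict, least_desirable, knows_nature)
```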

It is easily seen that $\delta_1$, $\delta_3$ and $\delta_4$ correspond to
extreme points of the risk set $S$ corresponding to the randomized rules in
$V$, and that points corresponding to $\delta_1$ and $\delta_3$, as well as the
segment between, are in the lower boundary of $S$. Suppose the decision maker
assigns equal U-utilities to $\delta_1$ and $\delta_3$, so the line segment
representing randomizations between $\delta_1$ and $\delta_3$ will be in a
$U$ contour. Then it should be the case that the induced subjective prior is
the least favorable prior, since the maximin utility $(1 - \frac{26}{63})$ is
achieved by such a rule (i.e., the minimax loss is attained by a Bayes rule
with respect to the least favorable prior; the minimax rule is Bayes for that
prior). Accordingly, suppose $U(\delta_1) = U(\delta_3)$. Since

$$U(\delta_1) = \tau(\theta_1) \cdot \tfrac{2}{3} + \tau(\theta_2) \cdot \tfrac{1}{3},$$

$$U(\delta_3) = \tau(\theta_1) \cdot \tfrac{1}{2} + \tau(\theta_2) \cdot \tfrac{13}{15},$$

it follows that $\tau(\theta_1) = 16/21$ and $\tau(\theta_2) = 5/21$, which is
indeed the least favorable prior. As a check, the corresponding U-utility of
the minimax rule is $\frac{37}{63}$, which is $1 - (\text{minimax loss})$.
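
The least favorable prior and the maximin utility can be recomputed from the forecast likelihoods and the utility table. This is a sketch under the assumption, taken from the text, that the maximin rule randomizes between "take $a_1$" and "follow the forecast"; all variable and function names are illustrative.

```python
from fractions import Fraction as Fr

# Forecast likelihoods P[X = x1 | theta] and utilities u = 1 - L.
p_x1 = {"theta1": Fr(3, 4), "theta2": Fr(1, 5)}
u = {
    ("theta1", "a1"): Fr(2, 3), ("theta1", "a2"): Fr(0),
    ("theta2", "a1"): Fr(1, 3), ("theta2", "a2"): Fr(1),
}

def state_utilities(rule):
    """Expected utility of a forecast-to-action rule in each state of nature."""
    return tuple(
        p_x1[th] * u[(th, rule["x1"])] + (1 - p_x1[th]) * u[(th, rule["x2"])]
        for th in ("theta1", "theta2")
    )

take_a1 = state_utilities({"x1": "a1", "x2": "a1"})  # (2/3, 1/3)
follow = state_utilities({"x1": "a1", "x2": "a2"})   # (1/2, 13/15)

# Equal U-utilities: tau1 * d1 + tau2 * d2 = 0 together with tau1 + tau2 = 1.
d1 = take_a1[0] - follow[0]
d2 = take_a1[1] - follow[1]
tau1 = -d2 / (d1 - d2)
tau2 = 1 - tau1
common_U = tau1 * take_a1[0] + tau2 * take_a1[1]

# Weight w on "take a1" that equalizes utility across the two states.
w = (follow[1] - follow[0]) / (d1 - d2)
maximin_utility = w * take_a1[0] + (1 - w) * follow[0]

print(tau1, tau2, common_U, maximin_utility)
```

The equalizing mixture and the prior computed from the indifference condition yield the same utility, which is the agreement the final check in the text describes.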



6. Conclusions

We have shown the existence of a unique probability measure induced
by a decision maker's preferences or utilities over a set of alternatives.
Furthermore, we have shown how that probability measure can actually be
constructed. Our results have been obtained for a very general structure
on the decision problem. We require the standard conditions for the
existence of the utility functions, and we require that the sample space
$\Omega$ be a locally compact Hausdorff space. This is a fairly weak
restriction on $\Omega$ admitting, for example,

(1) all countable spaces with the discrete topology,

(2) all intervals on the real line with the standard
Euclidean topology,

(3) n-dimensional Euclidean space with the standard
topology, and

(4) the complex plane.

We have also required that the class $\mathcal{F}$ of prize functions be
exactly the class of all continuous functions from $\Omega$ into $R^*$ with
compact support (and hence the class of decision functions $V$ is the class
of all continuous functions from $\Omega$ into $[0,1]$ with compact support).
This can be a very large class of functions, but we noted that the decision
maker need not express his utilities for the entire class $V$, but only for
a subset which generates the class. For the case where $\Omega$ is countable,
every function $D: \Omega \to [0,1]$ is continuous with the discrete topology
so that our results apply to the countable case with no additional
restrictions.

In addition to extending the previous results of Anscombe and Aumann [1]
to more general sample spaces and for larger classes of decision functions,
we relax the assumption of the monotone relationship between the two
utility functions $U$ and $u$ (we show that the relationship does follow).

One may argue that, if the decision maker possesses the ability to
make arbitrarily fine judgment discriminations as is implied by the
consistency and rationality axioms of utility theory, knowledge of the
imputed probability distribution can add no additional information. However,
what knowledge of the probability distribution can do is enable the decision
maker to concentrate on a small subset of relatively simple decision
alternatives. Once his utility evaluations for this set of alternatives are
determined, his subjective probability distribution can be extracted and
applied to evaluate the more complicated and uncertain alternatives so as
to agree with his original assessments. Furthermore, the probability
distribution offers feedback to the decision maker useful for checking his
utility assessments, and it provides him a method of communicating his
personal feelings about the unknown state of nature.